Labelled data bank of spoken standard German: the Kiel corpus of read/spontaneous speech
Abstract
This paper outlines the successive steps in the setting up of a labelled data bank of German read and spontaneous speech at IPDS Kiel.
Similar resources
German Today: an areally extensive Corpus of Spoken Standard German
The research project “German Today” aims to determine the amount of regional variation in (near-)standard German spoken by young and older educated adults and to identify and locate regional features. To this end, we compile an areally extensive corpus of read and spontaneous German speech. Secondary school students and 50-to-60-year-old locals are recorded in 160 cities throughout the German s...
German Today: An Extensive Speech Data Collection in the German Speaking Area of Europe
The research project “German Today” aims to determine the amount of regional variation in (near-) standard German spoken by young and older educated adults, and to identify and locate the regional features. To this end, an extensive corpus of read and spontaneous speech is currently being compiled. German is a so-called pluricentric language. With our corpus we aim to determine whether national...
Corpus Based Evaluation of Entropy Rate Speech Segmentation
The sequence of estimates of the speech signal’s entropy rate is investigated as a potential basis for speech segmentation. Rising and falling edges of that entropy rate curve and its maxima and minima are considered as candidates for segment boundaries. These prominent points are compared to the phonetic segment boundaries and to acoustic landmarks. The comparison is made using the American T...
FOLK-Gold ― A Gold Standard for Part-of-Speech-Tagging of Spoken German
In this paper, we present a GOLD standard of part-of-speech tagged transcripts of spoken German. The GOLD standard data consists of four annotation layers – transcription (modified orthography), normalization (standard orthography), lemmatization and POS tags – all of which have undergone careful manual quality control. It comes with guidelines for the manual POS annotation of transcripts of Ge...
Entropy Rate-based Stationary / Non-stationary Segmentation of Speech
This study evaluates the potential of the entropy rate contour to identify stationary and non-stationary segments of speech signals. The segmentation produced by an entropy rate-based method is compared to the manual phoneme segmentations of the TIMIT and the KIEL corpora. Characteristic points, i.e. steepest rises and falls of the entropy rate curve and its maxima and minima are investigated t...
Publication date: 1996